TEI Exploration Server

Warning: this site is under development!
Warning: this site was generated automatically from raw corpora.
The information has therefore not been validated.

GODDAG: A Data Structure for Overlapping Hierarchies

Internal identifier: 000217 (Main/Exploration); previous: 000216; next: 000218

Authors: Michael Sperberg-McQueen [United States]; Claus Huitfeldt [Norway]

Source:

RBID : ISTEX:1A6F6F3249B03EFD8B9CE8C5325578432CA83E2C

Abstract

Notations like SGML and XML represent document structures using tree structures; while this is in general a step forward from earlier systems, it creates certain difficulties for the representation of documents in which the structures of interest are not properly nested. Overlapping structures, discontinuous structures, and material which occurs in different orders in different parts, views, or versions of a document are all problems for SGML and XML. Overlapping structures have received attention from a variety of authors on SGML and XML, who have proposed various solutions including the use of non-SGML notations with translation into SGML for processing, the use of the CONCUR feature of SGML, exploitation of conditional marked sections in the DTD and document instance, the imposition of various kinds of unusual interpretations on SGML/XML elements as milestones or as fragments of some larger ‘virtual’ element, or the use of detailed annotation separate from the base text being annotated. An alternative is the use of a non-SGML/XML notation which does not require that elements form a hierarchical structure. One such notation, MECS, was developed by one of the authors and has been used in practice for over a decade. Unfortunately, the element structure of a MECS document cannot conveniently be represented as a tree, so that MECS processors lack the assistance provided to SGML/XML processors by the unifying assumption of a simple standard data structure for the document. We propose a data structure for representing documents with overlapping structures (including MECS documents). As in the conventional tree representation of SGML and XML, elements are represented by nodes in a graph, and the character data content of the document by labels on the leaves of the graph. We use a directed acyclic graph in which an arc a → b indicates that node b is a child of node a. Unlike SGML/XML trees, our graph structure allows children to have multiple parents. In the general form of the data structure, an ordering is imposed on the children of each node; this gives the data structure its name: general ordered-descendant directed acyclic graph (GODDAG). A restricted form of GODDAG, in which an ordering is imposed on the leaves of the graph, cannot handle multiple orderings of the same material but can represent any legal MECS document. The data structure here proposed should be useful in the representation of naturally occurring documents with complex structures; it may also be useful in other applications.
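The structure described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the names `Node`, `add_child`, `leaves`, and `content`, as well as the sample text, are hypothetical and chosen only for illustration. The sketch assumes that elements and leaves share one node type, that children are kept in an ordered list, and that a node may be attached to more than one parent.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Set


@dataclass(eq=False)  # identity-based equality avoids recursing through the graph
class Node:
    """One node of the graph: an element (name set) or a leaf (text set)."""
    name: Optional[str] = None   # element name; None for leaves
    text: Optional[str] = None   # character data label; None for elements
    children: List["Node"] = field(default_factory=list)  # ordered children
    parents: List["Node"] = field(default_factory=list)   # possibly several

    def add_child(self, child: "Node") -> None:
        # Record the arc self -> child; the child keeps any earlier parents.
        self.children.append(child)
        child.parents.append(self)


def leaves(node: Node, seen: Optional[Set[int]] = None) -> List[Node]:
    """Collect the leaves below `node` in child order, visiting each node once."""
    if seen is None:
        seen = set()
    if id(node) in seen:
        return []
    seen.add(id(node))
    if not node.children:
        return [node]
    found: List[Node] = []
    for child in node.children:
        found.extend(leaves(child, seen))
    return found


def content(node: Node) -> str:
    """Concatenate the character data on the leaves below `node`."""
    return "".join(leaf.text or "" for leaf in leaves(node))


# Hypothetical example: a sentence and a line that overlap on the middle leaf.
l1, l2, l3 = Node(text="alpha "), Node(text="beta "), Node(text="gamma")
s, line, root = Node(name="s"), Node(name="line"), Node(name="doc")
s.add_child(l1)
s.add_child(l2)       # l2 is a child of s ...
line.add_child(l2)    # ... and also a child of line
line.add_child(l3)
root.add_child(s)
root.add_child(line)

print(content(root))
print(content(s))
print(content(line))
```

In this example the leaf `l2` has two parents, which a tree cannot express without milestones or fragmentation. The traversal keeps a set of visited nodes so that shared leaves contribute their text only once.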

Url:
DOI: 10.1007/978-3-540-39916-2_12


Affiliations:



The document in XML format

<record>
<TEI wicri:istexFullTextTei="biblStruct">
<teiHeader>
<fileDesc>
<titleStmt>
<title xml:lang="en">GODDAG: A Data Structure for Overlapping Hierarchies</title>
<author>
<name sortKey="Sperberg Mcqueen, M" sort="Sperberg Mcqueen, M" uniqKey="Sperberg Mcqueen M" first="M." last="Sperberg-Mcqueen">Michael Sperberg-McQueen</name>
<affiliation>
<country>États-Unis</country>
<placeName>
<settlement type="city">Cambridge (Massachusetts)</settlement>
<region type="state">Massachusetts</region>
</placeName>
<orgName type="university" n="3">Massachusetts Institute of Technology</orgName>
</affiliation>
</author>
<author>
<name sortKey="Huitfeldt, Claus" sort="Huitfeldt, Claus" uniqKey="Huitfeldt C" first="Claus" last="Huitfeldt">Claus Huitfeldt</name>
</author>
</titleStmt>
<publicationStmt>
<idno type="wicri:source">ISTEX</idno>
<idno type="RBID">ISTEX:1A6F6F3249B03EFD8B9CE8C5325578432CA83E2C</idno>
<date when="2004" year="2004">2004</date>
<idno type="doi">10.1007/978-3-540-39916-2_12</idno>
<idno type="url">https://api.istex.fr/document/1A6F6F3249B03EFD8B9CE8C5325578432CA83E2C/fulltext/pdf</idno>
<idno type="wicri:Area/Istex/Corpus">000284</idno>
<idno type="wicri:Area/Istex/Curation">000284</idno>
<idno type="wicri:Area/Istex/Checkpoint">000177</idno>
<idno type="wicri:explorRef" wicri:stream="Istex" wicri:step="Checkpoint">000177</idno>
<idno type="wicri:doubleKey">0302-9743:2004:Sperberg Mcqueen M:goddag:a:data</idno>
<idno type="wicri:Area/Main/Merge">000235</idno>
<idno type="wicri:Area/Main/Curation">000217</idno>
<idno type="wicri:Area/Main/Exploration">000217</idno>
</publicationStmt>
<sourceDesc>
<biblStruct>
<analytic>
<title level="a" type="main" xml:lang="en">GODDAG: A Data Structure for Overlapping Hierarchies</title>
<author>
<name sortKey="Sperberg Mcqueen, M" sort="Sperberg Mcqueen, M" uniqKey="Sperberg Mcqueen M" first="M." last="Sperberg-Mcqueen">Michael Sperberg-McQueen</name>
<affiliation>
<country>États-Unis</country>
<placeName>
<settlement type="city">Cambridge (Massachusetts)</settlement>
<region type="state">Massachusetts</region>
</placeName>
<orgName type="university" n="3">Massachusetts Institute of Technology</orgName>
</affiliation>
<affiliation>
<wicri:noCountry code="no comma">E-mail: cmsmcq@acm.org</wicri:noCountry>
<country>États-Unis</country>
<placeName>
<settlement type="city">Cambridge (Massachusetts)</settlement>
<region type="state">Massachusetts</region>
</placeName>
<orgName type="university" n="3">Massachusetts Institute of Technology</orgName>
</affiliation>
</author>
<author>
<name sortKey="Huitfeldt, Claus" sort="Huitfeldt, Claus" uniqKey="Huitfeldt C" first="Claus" last="Huitfeldt">Claus Huitfeldt</name>
<affiliation></affiliation>
<affiliation wicri:level="1">
<country wicri:rule="url">Norvège</country>
</affiliation>
</author>
</analytic>
<monogr></monogr>
<series>
<title level="s">Lecture Notes in Computer Science</title>
<imprint>
<date>2004</date>
</imprint>
<idno type="ISSN">0302-9743</idno>
<idno type="eISSN">1611-3349</idno>
<idno type="ISSN">0302-9743</idno>
</series>
<idno type="istex">1A6F6F3249B03EFD8B9CE8C5325578432CA83E2C</idno>
<idno type="DOI">10.1007/978-3-540-39916-2_12</idno>
<idno type="ChapterID">12</idno>
<idno type="ChapterID">Chap12</idno>
</biblStruct>
</sourceDesc>
<seriesStmt>
<idno type="ISSN">0302-9743</idno>
</seriesStmt>
</fileDesc>
<profileDesc>
<textClass></textClass>
<langUsage>
<language ident="en">en</language>
</langUsage>
</profileDesc>
</teiHeader>
<front>
<div type="abstract" xml:lang="en">Abstract: Notations like SGML and XML represent document structures using tree structures; while this is in general a step forward from earlier systems, it creates certain difficulties for the representation of documents in which the structures of interest are not properly nested. Overlapping structures, discontinuous structures, and material which occurs in different orders in different parts, views, or versions of a document are all problems for SGML and XML. Overlapping structures have received attention from a variety of authors on SGML and XML, who have proposed various solutions including the use of non-SGML notations with translation into SGML for processing, the use of the concur feature of SGML, exploitation of conditional marked sections in the DTD and document instance, the imposition of various kinds of unusual interpretations on SGML/XML elements as milestones or as fragments of some larger ‘virtual’ element, or the use of detailed annotation separate from the base text being annotated. An alternative is the use of a non-SGML/XML notation which does not require that elements form a hierarchical structure. One such notation, MECS, was developed by one of the authors and has been used in practice for over a decade. Unfortunately, the element structure of a MECS document cannot conveniently be represented as a tree, so that MECS processors lack the assistance provided to SGML/XML processors by the unifying assumption of a simple standard data structure for the document. We propose a data structure for representing documents with overlapping structures (including MECS documents). As in the conventional tree representation of SGML and XML, elements are represented by nodes in a graph, and the character data content of the document by labels on the leaves of the graph. We use a directed acyclic graph in which an arc a → b indicates that node b is a child of node a. 
Unlike SGML/XML trees, our graph structure allows children to have multiple parents. In the general form of the data structure, an ordering is imposed on the children of each node; this gives the data structure its name: general ordered-descendant directed acyclic graph (GODDAG). A restricted form of GODDAG, in which an ordering is imposed on the leaves of the graph, cannot handle multiple orderings of the same material but can represent any legal MECS document. The data structure here proposed should be useful in the representation of naturally occurring documents with complex structures; it may also be useful in other applications.</div>
</front>
</TEI>
<affiliations>
<list>
<country>
<li>Norvège</li>
<li>États-Unis</li>
</country>
<region>
<li>Massachusetts</li>
</region>
<settlement>
<li>Cambridge (Massachusetts)</li>
</settlement>
<orgName>
<li>Massachusetts Institute of Technology</li>
</orgName>
</list>
<tree>
<country name="États-Unis">
<region name="Massachusetts">
<name sortKey="Sperberg Mcqueen, M" sort="Sperberg Mcqueen, M" uniqKey="Sperberg Mcqueen M" first="M." last="Sperberg-Mcqueen">Michael Sperberg-McQueen</name>
</region>
<name sortKey="Sperberg Mcqueen, M" sort="Sperberg Mcqueen, M" uniqKey="Sperberg Mcqueen M" first="M." last="Sperberg-Mcqueen">Michael Sperberg-McQueen</name>
</country>
<country name="Norvège">
<noRegion>
<name sortKey="Huitfeldt, Claus" sort="Huitfeldt, Claus" uniqKey="Huitfeldt C" first="Claus" last="Huitfeldt">Claus Huitfeldt</name>
</noRegion>
</country>
</tree>
</affiliations>
</record>
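
As a rough illustration of how a record like the one above might be read programmatically, the following Python sketch extracts the title, DOI, and year with the standard library. It works on a trimmed, hypothetical copy of the record: attributes with the `wicri:` prefix are left out because the record as displayed does not declare that namespace prefix, and a namespace-aware parser may reject it as shown. The variable names are illustrative only.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the record; wicri:-prefixed attributes are omitted
# because the displayed record does not declare that namespace prefix.
RECORD = """<record>
<TEI>
<teiHeader>
<fileDesc>
<titleStmt>
<title xml:lang="en">GODDAG: A Data Structure for Overlapping Hierarchies</title>
</titleStmt>
<publicationStmt>
<date when="2004" year="2004">2004</date>
<idno type="doi">10.1007/978-3-540-39916-2_12</idno>
</publicationStmt>
</fileDesc>
</teiHeader>
</TEI>
</record>"""

root = ET.fromstring(RECORD)

# Look up fields by path; the predicate selects the idno whose type is "doi".
title = root.findtext(".//titleStmt/title")
doi = root.findtext(".//publicationStmt/idno[@type='doi']")
year = root.find(".//publicationStmt/date").get("when")

print(title)
print(doi)
print(year)
```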

To manipulate this document under Unix (Dilib)

EXPLOR_STEP=$WICRI_ROOT/Wicri/Ticri/explor/TeiVM2/Data/Main/Exploration
HfdSelect -h $EXPLOR_STEP/biblio.hfd -nk 000217 | SxmlIndent | more

Or

HfdSelect -h $EXPLOR_AREA/Data/Main/Exploration/biblio.hfd -nk 000217 | SxmlIndent | more

To link to this page within the Wicri network

{{Explor lien
   |wiki=    Wicri/Ticri
   |area=    TeiVM2
   |flux=    Main
   |étape=   Exploration
   |type=    RBID
   |clé=     ISTEX:1A6F6F3249B03EFD8B9CE8C5325578432CA83E2C
   |texte=   GODDAG: A Data Structure for Overlapping Hierarchies
}}

Wicri

This area was generated with Dilib version V0.6.31.
Data generation: Mon Oct 30 21:59:18 2017. Site generation: Sun Feb 11 23:16:06 2024